class structure
What We Don't C: Representations for scientific discovery beyond VAEs
Rogers, Brian, Bowles, Micah, Lintott, Chris J., Croft, Steve
Accessing information in learned representations is critical for scientific discovery in high-dimensional domains. We introduce a novel method based on latent flow matching with classifier-free guidance that disentangles latent subspaces by explicitly separating information included in conditioning from information that remains in the residual representation. Across three experiments -- a synthetic 2D Gaussian toy problem, colored MNIST, and the Galaxy10 astronomy dataset -- we show that our method enables access to meaningful features of high dimensional data. Our results highlight a simple yet powerful mechanism for analyzing, controlling, and repurposing latent representations, providing a pathway toward using generative models for scientific exploration of what we don't capture, consider, or catalog.
ObjectRL: An Object-Oriented Reinforcement Learning Codebase
Baykal, Gulcin, Akgรผl, Abdullah, Haussmann, Manuel, Tasdighi, Bahareh, Werge, Nicklas, Wu, Yi-Shan, Kandemir, Melih
ObjectRL is an open-source Python codebase for deep reinforcement learning (RL), designed for research-oriented prototyping with minimal programming effort. Unlike existing codebases, ObjectRL is built on Object-Oriented Programming (OOP) principles, providing a clear structure that simplifies the implementation, modification, and evaluation of new algorithms. ObjectRL lowers the entry barrier for deep RL research by organizing best practices into explicit, clearly separated components, making them easier to understand and adapt. Each algorithmic component is a class with attributes that describe key RL concepts and methods that intuitively reflect their interactions. The class hierarchy closely follows common ontological relationships, enabling data encapsulation, inheritance, and polymorphism, which are core features of OOP. We demonstrate the efficiency of ObjectRL's design through representative use cases that highlight its flexibility and suitability for rapid prototyping. The documentation and source code are available at https://objectrl.readthedocs.io and https://github.com/adinlab/objectrl .
Legal Requirements Translation from Law
Singhal, Anmol, Breaux, Travis
Software systems must comply with legal regulations, which is a resource-intensive task, particularly for small organizations and startups lacking dedicated legal expertise. Extracting metadata from regulations to elicit legal requirements for software is a critical step to ensure compliance. However, it is a cumbersome task due to the length and complex nature of legal text. Although prior work has pursued automated methods for extracting structural and semantic metadata from legal text, key limitations remain: they do not consider the interplay and interrelationships among attributes associated with these metadata types, and they rely on manual labeling or heuristic-driven machine learning, which does not generalize well to new documents. In this paper, we introduce an approach based on textual entailment and in-context learning for automatically generating a canonical representation of legal text, encodable and executable as Python code. Our representation is instantiated from a manually designed Python class structure that serves as a domain-specific metamodel, capturing both structural and semantic legal metadata and their interrelationships. This design choice reduces the need for large, manually labeled datasets and enhances applicability to unseen legislation. We evaluate our approach on 13 U.S. state data breach notification laws, demonstrating that our generated representations pass approximately 89.4% of test cases and achieve a precision and recall of 82.2 and 88.7, respectively.
Review for NeurIPS paper: Steering Distortions to Preserve Classes and Neighbors in Supervised Dimensionality Reduction
Weaknesses: I have the following critical concerns about this paper: 1. The used datasets are too simple. More complex datasets such as SculptFaces in NeRV paper are required to evaluate the performance of the proposed algorithm. As detailed above, ClassNeRV could be seen as a variation of NeRV through penalizing within-class missed neighbors and between-class false neighbors with class information. Therefore, in my opinion, it is not a significant contribution. According to Sec 3.2, they derived the ClassNeRV Stress Function from NeRV Stress Function by splitting Eq. 2 into within-class and between-class relations.
Compressive Feature Learning Robert West Department of Computer Science Department of Computer Science Stanford University
This paper addresses the problem of unsupervised feature learning for text data. Our method is grounded in the principle of minimum description length and uses a dictionary-based compression scheme to extract a succinct feature set. Specifically, our method finds a set of word k-grams that minimizes the cost of reconstructing the text losslessly. We formulate document compression as a binary optimization task and show how to solve it approximately via a sequence of reweighted linear programs that are efficient to solve and parallelizable. As our method is unsupervised, features may be extracted once and subsequently used in a variety of tasks. We demonstrate the performance of these features over a range of scenarios including unsupervised exploratory analysis and supervised text categorization. Our compressed feature space is two orders of magnitude smaller than the full k-gram space and matches the text categorization accuracy achieved in the full feature space. This dimensionality reduction not only results in faster training times, but it can also help elucidate structure in unsupervised learning tasks and reduce the amount of training data necessary for supervised learning.
Do Self-Supervised and Supervised Methods Learn Similar Visual Representations?
Grigg, Tom George, Busbridge, Dan, Ramapuram, Jason, Webb, Russ
Despite the success of a number of recent techniques for visual self-supervised deep learning, there remains limited investigation into the representations that are ultimately learned. By using recent advances in comparing neural representations, we explore in this direction by comparing a constrastive self-supervised algorithm (SimCLR) to supervision for simple image data in a common architecture. We find that the methods learn similar intermediate representations through dissimilar means, and that the representations diverge rapidly in the final few layers. We investigate this divergence, finding that it is caused by these layers strongly fitting to the distinct learning objectives. We also find that SimCLR's objective implicitly fits the supervised objective in intermediate layers, but that the reverse is not true. Our work particularly highlights the importance of the learned intermediate representations, and raises important questions for auxiliary task design.
Unsupervised Representation Learning by Predicting Random Distances
Wang, Hu, Pang, Guansong, Shen, Chunhua, Ma, Congbo
Deep neural networks have gained tremendous success in a broad range of machine learning tasks due to its remarkable capability to learn semantic-rich features from high-dimensional data. However, they often require large-scale labelled data to successfully learn such features, which significantly hinders their adaption into unsupervised learning tasks, such as anomaly detection and clustering, and limits their applications into critical domains where obtaining massive labelled data is prohibitively expensive. To enable downstream unsupervised learning on those domains, in this work we propose to learn features without using any labelled data by training neural networks to predict data distances in a randomly projected space. Random mapping is a theoretical proven approach to obtain approximately preserved distances. To well predict these random distances, the representation learner is optimised to learn genuine class structures that are implicitly embedded in the randomly projected space. Experimental results on 19 real-world datasets show our learned representations substantially outperform state-of-the-art competing methods in both anomaly detection and clustering tasks. Unsupervised representation learning aims at automatically extracting expressive feature representations from data without any manually labelled data. Due to the remarkable capability to learn semantic-rich features, deep neural networks have been becoming one widely-used technique to empower a broad range of machine learning tasks. One main issue with these deep learning techniques is that a massive amount of labelled data is typically required to successfully learn these expressive features. As a result, their transformation power is largely reduced for tasks that are unsupervised in nature, such as anomaly detection and clustering. This is also true to critical domains, such as healthcare and fintech, where collecting massive labelled data is prohibitively expensive and/or is impossible to scale. To bridge this gap, in this work we explore fully unsupervised representation learning techniques to enable downstream unsupervised learning methods on those critical domains. In recent years, many unsupervised representation learning methods (Mikolov et al., 2013a; Le & Mikolov, 2014; Misra et al., 2016; Lee et al., 2017; Gidaris et al., 2018) have been introduced, of which most are self-supervised approaches that formulate the problem as an annotation free pretext task.
Evaluation of Multi-Slice Inputs to Convolutional Neural Networks for Medical Image Segmentation
Vu, Minh H., Grimbergen, Guus, Nyholm, Tufve, Lรถfstedt, Tommy
-- When using Convolutional Neural Networks (CNNs) for segmentation of organs and lesions in medical images, the conventional approach is to work with inputs and outputs either as single slice (2D) or whole volumes (3D). One common alternative, in this study denoted as pseudo-3D, is to use a stack of adjacent slices as input and produce a prediction for at least the central slice. This approach gives the network the possibility to capture 3D spatial information, with only a minor additional computational cost. In this study, we systematically evaluate the segmentation performance and computational costs of this pseudo-3D approach as a function of the number of input slices, and compare the results to conventional end-to-end 2D and 3D CNNs. The standard pseudo-3D method regards the neighboring slices as multiple input image channels. We additionally evaluate a simple approach where the input stack is a volumetric input that is repeatably convolved in 3D to obtain a 2D feature map. This 2D map is in turn fed into a standard 2D network. We conducted experiments using two different CNN backbone architectures and on five diverse data sets covering different anatomical regions, imaging modalities, and segmentation tasks. We found that while both pseudo-3D methods can process a large number of slices at once and still be computationally much more efficient than fully 3D CNNs, a significant improvement over a regular 2D CNN was only observed for one of the five data sets. An analysis of the structural properties of the segmentation masks revealed no relations to the segmentation performance with respect to the number of input slices. The conclusion is therefore that in the general case, multi-slice inputs appear to not significantly improve segmentation results over using 2D or 3D CNNs. Segmentation of organs and pathologies are common activities for radiologists and routine work for radiation oncologists. Nowadays manual annotation of such regions of interest is aided by various software toolkits for image enhancement, automated contouring, and structure analysis in all fields on image-guided radiotherapy [1, 2, 3]. Over the recent years, deep learning (DL) has emerged as a very powerful concept in the field of medical image analysis.
Machine Learning, Part 3: Unsupervised Learning
Clustering is sometimes called "unsupervised classification", a term that I have mixed feelings on for reasons I will cover shortly, but it provides a good enough explanation of the problem to be worth covering. First, the problem is unsupervised -- we won't have a labeled dataset to guide our logic. Secondly we are looking to separate items into classes based on the predictors (technically they are not predictors they are "features" here because there is no response). The difference is that in supervised classification the class structure is known and labeled, whereas in clustering we are inventing the class structure from the feature values alone. In supervised classification we used the labels to single out one class and looked for predictors that had two qualities: 1) They had fairly common values for every example of that class and 2) they separated that class from others.
Stochastic Incremental Algorithms for Optimal Transport with SON Regularizer
Panahi, Ashkan, Thiel, Erik, Cheraghani, Morteza H., Dubhashi, Devdatt
We introduce a new regularizer for optimal transport (OT) which is tailored to better preserve the class structure. We give the first theoretical guarantees for an OT scheme that respects class structure. We give an accelerated proximal--projection scheme for this formulation with the proximal operator in closed form to give a highly scalable algorithm for computing optimal transport plans. We give a novel argument for the uniqueness of the optimum even in the absence of strong convexity. Our experiments show that the new regularizer preserves class structure better and is more robust compared to previous regularizers.